
VCF is the standard file format for variant calls. A VCF file can be converted into an ANNOVAR input file by passing the “-format vcf4” argument to the “convert2annovar.pl” script.

Figure 4.17 shows the directory tree, which includes the “annovar” directory, containing the ANNOVAR scripts and subdirectories, and the “input” directory, containing the VCF files (sarscov2.vcf and humanSNP.vcf) from the previous SARS-CoV-2 and human variant calling examples. We copied them to this directory for simplicity. The following command converts the “humanSNP.vcf” file into the ANNOVAR input file “humanSNP.avinput”:

convert2annovar.pl \
-format vcf4 input/humanSNP.vcf \
> input/humanSNP.avinput
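The same conversion can be applied to both VCF files in the “input” directory. The following is a minimal sketch, not part of ANNOVAR: the helper name avinput_name is hypothetical, and the loop only prints the commands (a dry run) so it can be tried without ANNOVAR installed.

```shell
#!/bin/sh
# Sketch: derive the .avinput file name for each VCF file and print the
# convert2annovar.pl command that would be run (dry run only).

# Hypothetical helper: strip the trailing ".vcf" and append ".avinput".
avinput_name() {
    printf '%s\n' "${1%.vcf}.avinput"
}

# The two VCF files named in the text.
for vcf in input/sarscov2.vcf input/humanSNP.vcf; do
    out=$(avinput_name "$vcf")
    echo "convert2annovar.pl -format vcf4 $vcf > $out"
done
```

Removing the echo and the surrounding quotes would run the conversions for real, assuming convert2annovar.pl is on the PATH.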

Figure 4.18 shows the ANNOVAR input file, which contains the five essential columns followed by three additional columns.
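In the ANNOVAR input format, the five essential columns are chromosome, start position, end position, reference allele, and alternative allele. The sketch below shows one way to pull out those columns with awk. The sample rows are made-up placeholders, not taken from humanSNP.avinput, and the function name core_columns is hypothetical; the meaning of the three extra columns is not assumed here.

```shell
#!/bin/sh
# Sketch: print the five core fields of a tab-delimited ANNOVAR input file.

# Hypothetical helper: label columns 1-5 of each line.
core_columns() {
    awk -F'\t' '{ printf "chr=%s start=%s end=%s ref=%s alt=%s\n", $1, $2, $3, $4, $5 }' "$1"
}

# Made-up two-line sample (placeholder values, for illustration only);
# the last three fields stand in for the additional columns.
sample=$(mktemp)
printf '1\t1000\t1000\tA\tG\textra1\textra2\textra3\n' > "$sample"
printf '2\t2000\t2000\tC\tT\textra1\textra2\textra3\n' >> "$sample"

core_columns "$sample"
rm -f "$sample"
```

On a real file, the call would be core_columns input/humanSNP.avinput.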

To convert other variant call file formats, run “convert2annovar.pl -h” to list the supported formats. The script is also used with the “-dbSNP” option to add dbSNP accessions.

Variant annotation with ANNOVAR:

The “annotate_variation.pl” script is the core program for ANNOVAR annotation. It requires an ANNOVAR input file. The “table_annovar.pl” script is also used for annotation, and it can take a VCF file as input.

./annotate_variation.pl \

-out ../output/humanSNPannot \

FIGURE 4.17  The ANNOVAR directory tree.

FIGURE 4.18  ANNOVAR input file.